Abstract

We devise a new embedding technique, which we call measured descent, based on decomposing a metric space locally, at varying speeds, according to the density of some probability measure. This provides a refined and unified framework for the two primary methods of constructing Fréchet embeddings for finite metrics, due to [Bourgain, 1985] and [Rao, 1999]. We prove that any $n$-point metric space $(X,d)$ embeds in Hilbert space with distortion $O(\sqrt{\alpha_X \cdot \log n})$, where $\alpha_X$ is a geometric estimate on the decomposability of $X$. As an immediate corollary, we obtain an $O(\sqrt{(\log \lambda_X) \log n})$ distortion embedding, where $\lambda_X$ is the doubling constant of $X$. Since $\lambda_X \le n$, this result recovers Bourgain's theorem, but when the metric $X$ is, in a sense, "low-dimensional," improved bounds are achieved.

Our embeddings are volume-respecting for subsets of arbitrary size. One consequence is the existence of $(k, O(\log n))$ volume-respecting embeddings for all $1 \le k \le n$, which is the best possible, and answers positively a question posed by U. Feige. Our techniques are also used to answer positively a question of Y. Rabinovich, showing that any weighted $n$-point planar graph embeds in $\ell_\infty^{O(\log n)}$ with $O(1)$ distortion. The $O(\log n)$ bound on the dimension is optimal, and improves upon the previously known bound of $O((\log n)^2)$.

1 Introduction

The theory of low-distortion embeddings of finite metric spaces into normed spaces has attracted a lot of attention in recent decades, due to its intrinsic geometric appeal, as well as its applications in Computer Science. A major driving force in this research area has been the quest for analogies between the theory of finite metric spaces and the local theory of Banach spaces. While being very successful, this point of view did not always result in satisfactory metric analogues of basic theorems from the theory of finite-dimensional normed spaces. An example of this is Bourgain's embedding theorem [5], the forefather of modern embedding theory, which states that every $n$-point metric space embeds into a Euclidean space with distortion $O(\log n)$. This upper bound on the distortion is known to be optimal [27]. Taking the point of view that $\log n$ is a substitute for the dimension of an $n$-point metric space (see [?]; this approach is clearly natural when applied to a net in the unit ball of some normed space), an analogue of John's theorem [17] would assert that $n$-point metrics embed into Hilbert space with distortion $O(\sqrt{\log n})$. As this is not the case, the present work is devoted to a more refined analysis of the Euclidean distortion of finite metrics, and in particular to the role of a metric notion of dimension.

We introduce a new embedding method, called measured descent, which unifies and refines the known methods of Bourgain [5] and Rao [33] for constructing Fréchet-type embeddings (i.e. embeddings where each coordinate is proportional to the distance from some subset of the metric space). Our method yields an embedding of any $n$-point metric space $X$ into $\ell_2$ with distortion $O(\sqrt{\alpha_X \log n})$, where $\alpha_X$ is a geometric estimate on the decomposability of $X$ (see Definition 1.3 for details). As $\alpha_X \le O(\log n)$, we obtain a refinement of Bourgain's theorem, and when $\alpha_X$ is small (which includes several important families of metrics) improved distortion bounds are achieved. This technique easily generalizes to produce embeddings which preserve higher dimensional structures (i.e. not just distances between pairs of points). For instance, our embeddings can be made volume-respecting in the sense of Feige (see Section 1.2), and hence we obtain optimal volume-respecting embeddings for arbitrary $n$-point spaces.

Applications. In recent years, metric embedding has become a frequently used algorithmic tool. For example, embeddings into normed spaces have found applications to approximating the sparsest cut of a graph [?] and the bandwidth of a graph [?], and to distance labeling schemes (see e.g. [16, Sec. 2.2]). The embeddings introduced in this paper refine our knowledge on these problems, and in some cases improve the known algorithmic results. For instance, they immediately imply an improved approximate max-flow/min-cut theorem (and algorithm) for graphs excluding a fixed minor, and an improved algorithm for approximating the bandwidth of graphs whose metric has a small doubling constant.

1.1 Notation 

Let $(X,d)$ be an $n$-point metric space. We denote by $B(x,r) = \{y \in X : d(x,y) < r\}$ the open ball of radius $r$ about $x$. For a subset $S \subseteq X$, we write $d(x,S) = \min_{y \in S} d(x,y)$, and define $\mathrm{diam}(S) = \max_{x,y \in S} d(x,y)$. We recall that the doubling constant of $X$, denoted $\lambda_X$, is the least value $\lambda$ such that every ball in $X$ can be covered by $\lambda$ balls of half the radius [23, ?, 29, 15]. We say that a measure $\mu$ on $X$ is non-degenerate if $\mu(x) > 0$ for all $x \in X$. For a non-degenerate measure $\mu$ on $X$ define $\Phi(\mu) = \max_{x \in X} \mu(X)/\mu(x)$ to be the aspect ratio of $\mu$.

Let $(X,d_X)$ and $(Y,d_Y)$ be metric spaces. A mapping $f : X \to Y$ is called $C$-Lipschitz if $d_Y(f(x),f(y)) \le C \cdot d_X(x,y)$ for all $x,y \in X$. The mapping $f$ is called $K$-bi-Lipschitz if there exists a $C > 0$ such that

\[ C K^{-1} \cdot d_X(x,y) \le d_Y(f(x),f(y)) \le C \cdot d_X(x,y), \]

for all $x,y \in X$. The least $K$ for which $f$ is $K$-bi-Lipschitz is called the distortion of $f$, and is denoted $\mathrm{dist}(f)$. The least distortion with which $X$ may be embedded in $Y$ is denoted $c_Y(X)$. When $Y = L_p$ we use the notation $c_Y(\cdot) = c_p(\cdot)$. Finally, the parameter $c_2(X)$ is called the Euclidean distortion of $X$.
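For a finite metric space the distortion of a concrete map can be computed directly from the definition: the least $K$ equals the largest expansion ratio times the largest contraction ratio over all pairs. The following is a minimal sketch under that reading; the function name `distortion` and its argument conventions are assumptions made for this illustration and are not part of the paper.

```python
from itertools import combinations


def distortion(points, d_X, d_Y, f):
    """Distortion of an injective map f from (points, d_X) into a space with metric d_Y.

    Hypothetical helper for illustration: returns the least K such that
    C/K * d_X(x, y) <= d_Y(f(x), f(y)) <= C * d_X(x, y) for some C > 0.
    """
    expansion = 0.0    # largest d_Y(f(x), f(y)) / d_X(x, y)
    contraction = 0.0  # largest d_X(x, y) / d_Y(f(x), f(y))
    for x, y in combinations(points, 2):
        dx = d_X(x, y)
        dy = d_Y(f(x), f(y))
        if dy == 0:
            raise ValueError("f must be injective")
        expansion = max(expansion, dy / dx)
        contraction = max(contraction, dx / dy)
    return expansion * contraction
```

For instance, mapping the shortest-path metric of the 4-cycle onto the corners of the unit square keeps adjacent distances and shrinks the two diagonals from 2 to $\sqrt{2}$, so the computed value is $\sqrt{2}$.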

Metric Decomposition. Let $(X,d)$ be a finite metric space. Given a partition $P = \{C_1, \ldots, C_m\}$ of $X$, we refer to the sets $C_i$ as clusters. We write $\mathcal{P}_X$ for the set of all partitions of $X$. For $x \in X$ and a partition $P \in \mathcal{P}_X$ we denote by $P(x)$ the unique cluster of $P$ containing $x$. Finally, the set of all probability distributions on $\mathcal{P}_X$ is denoted $\mathcal{D}_X$.

Definition 1.1 (Padded decomposition). A (stochastic) decomposition of a finite metric space $(X,d)$ is a distribution $\Pr \in \mathcal{D}_X$ over partitions of $X$. Given $\Delta > 0$ and $\varepsilon : X \to (0,1]$, a $\Delta$-bounded $\varepsilon$-padded decomposition is one which satisfies the following two conditions.

1. For all $P \in \mathrm{supp}(\Pr)$, for all $C \in P$, $\mathrm{diam}(C) \le \Delta$.

2. For all $x \in X$, $\Pr[B(x, \varepsilon(x)\Delta) \not\subseteq P(x)] \le \frac{1}{2}$.

We will actually need a collection of such decompositions, with the diameter bound $\Delta > 0$ ranging over all integral powers of 2 (of course the value 2 is arbitrary).

Definition 1.2 (Decomposition bundle). Given a function $\varepsilon : X \times \mathbb{Z} \to (0,1]$, an $\varepsilon$-padded decomposition bundle on $X$ is a function $\beta : \mathbb{Z} \to \mathcal{D}_X$, where for every $u \in \mathbb{Z}$, $\beta(u)$ is a $2^u$-bounded $\varepsilon(\cdot,u)$-padded stochastic decomposition of $X$.
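The two conditions of a padded decomposition (bounded cluster diameter, and a ball around $x$ being split with probability at most $1/2$) are easy to check on a small, explicitly listed distribution over partitions. The sketch below does this; representing a stochastic decomposition as a list of `(probability, partition)` pairs, and the helper names, are assumptions made for this illustration only.

```python
def diameter(cluster, d):
    """Largest pairwise distance inside a cluster."""
    return max((d(x, y) for x in cluster for y in cluster), default=0.0)


def is_bounded(partition, d, delta):
    """Condition 1: every cluster has diameter at most delta."""
    return all(diameter(c, d) <= delta for c in partition)


def padding_failure_probability(x, points, d, radius, decomposition):
    """Probability that the open ball B(x, radius) is not contained in P(x).

    decomposition is a list of (probability, partition) pairs, where each
    partition is a list of sets (a hypothetical encoding for this sketch).
    """
    ball = {y for y in points if d(x, y) < radius}
    failure = 0.0
    for prob, partition in decomposition:
        cluster = next(c for c in partition if x in c)
        if not ball <= cluster:
            failure += prob
    return failure
```

Condition 2 of the definition then asks that `padding_failure_probability(x, points, d, eps(x) * delta, decomposition)` be at most 1/2 for every point `x`.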

Finally, we associate to every finite metric space an important "decomposability" parameter $\alpha_X$. (See [?] for relationships to other notions of decomposability.)

Definition 1.3 (Modulus of padded decomposability). The modulus of padded decomposability of a finite metric space $(X,d)$ is defined as

\[ \alpha_X = \inf\{\alpha : \text{there exists an } \varepsilon\text{-padded decomposition bundle on } X \text{ with } \varepsilon(x,u) \equiv 1/\alpha\}. \]

It is known that $\alpha_X = O(\log n)$ [?]; furthermore $\alpha_X = O(\log \lambda_X)$ [?]. Additionally, if $X$ is the shortest-path metric on an $n$-point constant-degree expander, then $\alpha_X = \Omega(\log n)$ [?]. For every metric space $X$ induced by an edge-weighted graph which excludes $K_{r,r}$ as a minor, it is shown in the sequence of papers [?] that $\alpha_X = O(r^2)$.

Volume-respecting embeddings. We recall the notion of volume-respecting embeddings, which was introduced by Feige [12] as a tool in his study of the graph bandwidth problem. Let $S \subseteq X$ be a $k$-point subset of $X$. We define its volume by

\[ \mathrm{vol}(S) = \sup\{\mathrm{vol}_{k-1}(\mathrm{conv}(f(S))) : f : S \to L_2 \text{ is 1-Lipschitz}\}, \]

where for $A \subseteq L_2$, $\mathrm{conv}(A)$ denotes its convex hull and the $(k-1)$-dimensional volume above is computed with respect to the Euclidean structure induced by $L_2$. A mapping $f : X \to L_2$ is called $(k,\eta)$-volume-respecting if it is 1-Lipschitz and for every $k$-point subset $S \subseteq X$,

\[ \left[\frac{\mathrm{vol}(S)}{\mathrm{vol}_{k-1}(\mathrm{conv}(f(S)))}\right]^{1/(k-1)} \le \eta. \]

It is easy to see that a 1-Lipschitz map $f : X \to L_2$ has distortion $D$ if and only if it is $(2,D)$-volume-respecting. Thus the volume-respecting property is a generalization of distortion to larger subsets.

1.2 Results

The following theorem refines Bourgain's result in terms of the decomposability parameter.

Theorem 1.4 (Padded embedding theorem). For every $n$-point metric space $(X,d)$, and every $1 \le p \le \infty$,

\[ c_p(X) \le O\left(\alpha_X^{1-1/p} (\log n)^{1/p}\right). \tag{1} \]

The proof appears in Sections 1.3 and 2. Since $\alpha_X = O(\log \lambda_X)$, it implies in particular that $c_2(X) \le O(\sqrt{(\log \lambda_X) \cdot \log |X|})$, for any metric space $X$. This refines Bourgain's embedding theorem [5], and improves upon previous embeddings of doubling metrics [14]. It is tight for $\lambda_X = O(1)$ [21, 22, 11], and $\lambda_X = n^{\Omega(1)}$ [27]. The question of whether this bound is tight up to a constant factor for the range $\lambda_X \in \{c_1, \ldots, n^{c_2}\}$, where $c_1 \in \mathbb{N}$, $0 < c_2 < 1$ are some constants, is an interesting open problem.

For $1 \le p < 2$, the bound $O(\sqrt{\alpha_X \log n})$ is better than (1), and thus, in these cases, it makes sense to construct the embedding first into $L_2$.
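To see why the $L_2$ bound wins for $1 \le p < 2$, note that the ratio of the two bounds is $(\log n / \alpha_X)^{1/p - 1/2}$, which is at least 1 whenever $\alpha_X \le \log n$. The sketch below simply evaluates both expressions with the hidden constants of the $O(\cdot)$ notation set to 1; that normalisation and the function names are assumptions of this illustration.

```python
import math


def lp_bound(alpha, n, p):
    """The bound of (1) with the hidden constant taken to be 1 (an assumption)."""
    return alpha ** (1 - 1 / p) * math.log(n) ** (1 / p)


def l2_bound(alpha, n):
    """The Euclidean bound sqrt(alpha * log n), again with constant 1."""
    return math.sqrt(alpha * math.log(n))
```

For example, with `alpha = 2` and `n = 1024` the value of `lp_bound` at `p = 1` is `log(1024)`, roughly 6.9, while `l2_bound` is roughly 3.7; at `p = 2` the two expressions coincide.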

A more careful analysis of the proof of Theorem 1.4 yields the following result, proved in Section 2.2, which answers a question posed by Feige in [12].

Theorem 1.5 (Optimal volume-respecting embeddings). Every $n$-point metric space $X$ admits an embedding into $L_2$ which is $(k, O(\sqrt{\alpha_X \log n}))$ volume-respecting for every $2 \le k \le n$.

Since $\alpha_X = O(\log n)$, this provides $(k, O(\log n))$-volume-respecting embeddings for every $2 \le k \le n$. This is optimal; a matching lower bound is given in [?] for all $k < n^{1/3}$. We note that the previous best bounds were due to Feige [12], who showed that a variant of Bourgain's embedding achieves distortion $O(\sqrt{\log n} \cdot \sqrt{\log n + k \log k})$ (note that this is $\Omega(\sqrt{n})$ for large values of $k$), and to Rao, who showed that $O((\log n)^{3/2})$ volume distortion is achievable for all $1 \le k \le n$ (this follows indirectly from [33], and was first observed in [15]).

This also improves the dependence on $r$ in Rao's volume-respecting embeddings of $K_{r,r}$-excluded metrics, from $(k, O(r^3 \sqrt{\log n}))$, due to [33, ?], to $(k, O(r\sqrt{\log n}))$.^1 As a corollary, we obtain an improved $O(r\sqrt{\log n})$-approximate max-flow/min-cut algorithm for graphs which exclude $K_{r,r}$ as a minor.

$\ell_\infty$ embeddings. It is not difficult to see that every $n$-point metric space $(X,d)$ embeds isometrically into $\ell_\infty^n$ via the map $y \mapsto \{d(x,y)\}_{x \in X}$. For some spaces, like the shortest-path metrics on expanders, or on the $\log n$-dimensional hypercube (see, e.g. [?]), it is known that $n^{\Omega(1)}$ dimensions are required to obtain any map with $O(1)$ distortion. On the other hand, a simple variant of Rao's embedding shows that every planar metric $O(1)$-embeds into $\ell_\infty^{O((\log n)^2)}$. Thus the dimension required to embed a family of metrics into $\ell_\infty$ with low distortion is a certain measure of the family's complexity (see [?]).
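The isometric embedding into $\ell_\infty^n$ mentioned above is short enough to write out. Each point $y$ is sent to its vector of distances to all points of $X$; the triangle inequality gives $|d(x,y) - d(x,z)| \le d(y,z)$ in every coordinate, with equality in the coordinate $x = y$. The function names below are hypothetical and chosen only for this sketch.

```python
def frechet_embedding(points, d):
    """Map each point y to the vector (d(x, y)) indexed by x in points."""
    return {y: [d(x, y) for x in points] for y in points}


def linf(u, v):
    """The l_infinity distance between two vectors of equal length."""
    return max(abs(a - b) for a, b in zip(u, v))
```

Applied to any finite metric, `linf(emb[y], emb[z])` reproduces `d(y, z)` exactly, at the cost of using $n$ coordinates.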

In Section 3 we use a refinement of measured descent to prove the following theorem, which answers positively a question posed by Y. Rabinovich [32], and improves Rao's result to obtain the optimal bound.

Theorem 1.6. Let $X$ be an $n$-point edge-weighted planar graph, equipped with the shortest path metric. Then $X$ embeds into $\ell_\infty^{O(\log n)}$ with $O(1)$ distortion.

The $O(\log n)$ bound on the dimension is clearly optimal (by simple volume arguments). Furthermore, this result is stronger than the $O(\sqrt{\log n})$ distortion bound on Euclidean embeddings of planar metrics, due to Rao [33]. The embedding is produced by "derandomizing" both the decomposition bundle of [33, ?] and the proof of measured descent (applied to this special decomposition bundle).

^1 This bound is tight for the path even for $k = 3$, see [?].

1.3 The main technical lemmas 

The following lemma is based on a decomposition of [?], with the improved analysis of [?]. The extension to general measures was observed in [25]. Since this lemma is central to our techniques, its proof is presented in Section 2 for completeness. Throughout, $\log x$ denotes the natural logarithm of $x$.

Lemma 1.7. Let $(X,d)$ be a finite metric space and let $\mu$ be any non-degenerate measure on $X$. Then there exists an $\varepsilon(x,u)$-padded decomposition bundle on $X$ where

\[ \varepsilon(x,u) = \left[ 16 + 16 \log \frac{\mu(B(x,2^u))}{\mu(B(x,2^{u-3}))} \right]^{-1}. \tag{2} \]
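For the counting measure, the padding parameter of Lemma 1.7 depends only on how many points lie in two concentric balls. The sketch below evaluates $\varepsilon(x,u) = [16 + 16 \log(|B(x,2^u)| / |B(x,2^{u-3})|)]^{-1}$, the form of (2) that matches the computation in the proof of the lemma in Section 2; the function names are assumptions made for this illustration.

```python
import math


def ball_size(x, points, d, r):
    """Number of points in the open ball B(x, r)."""
    return sum(1 for y in points if d(x, y) < r)


def padding_parameter(x, u, points, d):
    """epsilon(x, u) for the counting measure (hypothetical helper)."""
    big = ball_size(x, points, d, 2.0 ** u)
    small = ball_size(x, points, d, 2.0 ** (u - 3))  # at least 1: contains x
    return 1.0 / (16 + 16 * math.log(big / small))
```

The value is always at most 1/16, and it equals 1/16 exactly when the two balls contain the same points, that is, when the space looks locally "flat" around $x$ at scale $2^u$.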



Remark 1.1. In [?] it was shown that $X$ admits a doubling measure, i.e. a non-degenerate measure $\mu$ such that for every $x \in X$ and every $r > 0$ we have $\frac{\mu(B(x,2r))}{\mu(B(x,r))} \le \lambda_X^{O(1)}$. We thus recover the fact, first proved in [?], that for every metric space $X$, $\alpha_X = O(\log \lambda_X)$. In particular, for every $d$-dimensional normed space $Y$, $\alpha_Y = O(d)$. In [?] it is argued that $\alpha_Y = \Omega(d)$ when $Y = \ell_1^d$. The same lower bound was shown to hold for every $d$-dimensional normed space $Y$ in [26].

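The main embedding lemma. Let $(X,d)$ be a finite metric space, and for $\varepsilon : X \times \mathbb{Z} \to \mathbb{R}$ define for all $x,y \in X$,

\[ \delta_\varepsilon(x,y) = \min\left\{ \varepsilon(x,u) : u \in \mathbb{Z} \text{ and } \frac{d(x,y)}{32} \le 2^u \le \frac{d(x,y)}{2} \right\}. \]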

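Given a non-degenerate measure $\mu$ on $X$ denote for $x,y \in X$:

\[ V_\mu(x,y) = \max\left\{ \log\frac{\mu(B(x,2d(x,y)))}{\mu(B(x,d(x,y)/512))},\ \log\frac{\mu(B(y,2d(x,y)))}{\mu(B(y,d(x,y)/512))} \right\}. \]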

In what follows we use the standard notation $c_{00}$ for the space of all finite sequences of real numbers. The following result is the main embedding lemma of this paper.

Lemma 1.8 (Main embedding lemma). Let $X$ be an $n$-point metric space, $\mu$ a non-degenerate measure on $X$, and $\beta : \mathbb{Z} \to \mathcal{D}_X$ an $\varepsilon(x,u)$-padded decomposition bundle on $X$. Then there exists a map $\varphi : X \to c_{00}$ such that for every $1 \le p \le \infty$ and for all distinct $x,y \in X$,

\[ [V_\mu(x,y)]^{1/p} \cdot \min\{\delta_\varepsilon(x,y), \delta_\varepsilon(y,x)\} \le \frac{\|\varphi(x) - \varphi(y)\|_p}{d(x,y)} \le C\,[\log \Phi(\mu)]^{1/p}. \]

Here $C$ is a universal constant.

Using Lemma 1.8 we are in a position to prove Theorem 1.4. We start with the following simple observation, which bounds $V_\mu(x,y)$ from below, and which will be used several times in what follows.

Lemma 1.9. Let $\mu$ be any non-degenerate measure on $X$ and $x,y \in X$, $x \ne y$. Then

\[ \max\left\{ \frac{\mu(B(x,2d(x,y)))}{\mu(B(x,d(x,y)/2))},\ \frac{\mu(B(y,2d(x,y)))}{\mu(B(y,d(x,y)/2))} \right\} \ge 2. \]

Proof. Assume without loss of generality that $\mu(B(x,2d(x,y))) \le \mu(B(y,2d(x,y)))$. The two balls $B(x,d(x,y)/2)$ and $B(y,d(x,y)/2)$ are disjoint, and both are contained in $B(x,2d(x,y))$, so $\mu(B(x,d(x,y)/2)) + \mu(B(y,d(x,y)/2)) \le \mu(B(x,2d(x,y)))$. If both ratios were smaller than 2, the left-hand side would exceed $\frac{1}{2}\mu(B(x,2d(x,y))) + \frac{1}{2}\mu(B(y,2d(x,y))) \ge \mu(B(x,2d(x,y)))$, a contradiction. □



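Proof of Theorem 1.4. Fix $p \in [1,\infty]$ and let $\mu = |\cdot|$ be the counting measure on $X$. Let $\varepsilon(x,u)$ be as in (2), and observe that in this case for all $x,y \in X$ we have $\delta_\varepsilon(x,y) \ge [16 + 16 V_\mu(x,y)]^{-1}$. Applying Lemma 1.8 to the decomposition bundle of Lemma 1.7 we get a mapping $\varphi_1 : X \to L_p$ such that for all $x,y \in X$,

\[ \frac{[V_\mu(x,y)]^{1/p}}{16 + 16 V_\mu(x,y)} \le \frac{\|\varphi_1(x) - \varphi_1(y)\|_p}{d(x,y)} \le C (\log n)^{1/p}. \]

On the other hand, Lemma 1.8 applied to the decomposition bundle ensured by the definition of $\alpha_X$ yields a mapping $\varphi_2 : X \to L_p$ for which

\[ \frac{[V_\mu(x,y)]^{1/p}}{\alpha_X} \le \frac{\|\varphi_2(x) - \varphi_2(y)\|_p}{d(x,y)} \le C (\log n)^{1/p}. \]

Finally, for $\varphi = \varphi_1 \oplus \varphi_2$ we have

\[ \frac{\|\varphi(x) - \varphi(y)\|_p^p}{d(x,y)^p} \ge \frac{V_\mu(x,y)}{[16 + 16 V_\mu(x,y)]^p} + \frac{V_\mu(x,y)}{\alpha_X^p} = \Omega\left(\frac{1}{\alpha_X^{p-1}}\right), \]

where we have used the fact that Lemma 1.9 implies that $V_\mu(x,y) \ge \Omega(1)$. □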
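The final estimate in the proof of Theorem 1.4 rests on comparing $V_\mu(x,y)$ with $\alpha_X$. The worked derivation below spells out the two cases for $p < \infty$. The explicit constant 40 is an illustrative choice made for this sketch, not a constant from the paper; the derivation uses only $V_\mu(x,y) \ge \log 2$ (from Lemma 1.9) and $\alpha_X \ge 1$ (padding parameters take values in $(0,1]$).

```latex
% Write V = V_\mu(x,y) \ge \log 2 and \alpha = \alpha_X \ge 1.
% Since V \ge \log 2, we have 16 + 16V \le 16(1 + 1/\log 2)\,V < 40V.
\begin{align*}
\text{Case } V \le \alpha:\quad
  \frac{V}{(16+16V)^p} &\ge \frac{V}{(40V)^p}
     = \frac{V^{1-p}}{40^p} \ge \frac{\alpha^{1-p}}{40^p}, \\
\text{Case } V \ge \alpha:\quad
  \frac{V}{\alpha^p} &\ge \alpha^{1-p}.
\end{align*}
% In both cases the sum of the two terms is at least 40^{-p}\alpha^{1-p}.
% Taking p-th roots:
\[
  \frac{\|\varphi(x)-\varphi(y)\|_p}{d(x,y)} \ge \frac{1}{40\,\alpha_X^{1-1/p}} .
\]
```

Together with the upper bound $\|\varphi(x)-\varphi(y)\|_p \le 2^{1/p} C (\log n)^{1/p}\, d(x,y)$, which follows by adding the $p$-th powers of the two upper bounds, this gives distortion $O(\alpha_X^{1-1/p} (\log n)^{1/p})$.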



2 Measured descent 

In this section, we prove the main embedding lemma and exhibit the existence of optimal volume-respecting embeddings. We use decomposition bundles to construct random subsets of $X$, the distances from which are used as coordinates of an embedding into $c_{00}$. As the diameters of the decompositions become smaller, our embedding "zooms in" on the increasingly finer structure of the space. Our approach is heavily based on the existence of good decomposition bundles; we thus start by proving Lemma 1.7, which is essentially contained in [?].

Proof of Lemma 1.7. By approximating the values $\{\mu(x)\}_{x \in X}$ by rational numbers and duplicating points, it is straightforward to verify that it is enough to prove the required result for the counting measure on $X$, i.e. when $\mu(S) = |S|$.

Let $\Delta = 2^u$ for some $u \in \mathbb{Z}$. We now describe the distribution $\beta(u)$. Choose, uniformly at random, a permutation $\pi$ of $X$ and a value $\alpha \in [\frac{1}{4}, \frac{1}{2}]$. For every point $y \in X$, define a cluster

\[ C_y = B(y, \alpha\Delta) \setminus \bigcup_{z : \pi(z) < \pi(y)} B(z, \alpha\Delta). \]

In words, a point $x \in X$ is assigned to $C_y$ where $y$ is the minimal point according to $\pi$ that is within distance $\alpha\Delta$ from $x$.

Clearly the set $P = \{C_y\}_{y \in X}$ constitutes a partition of $X$. Furthermore, $C_y \subseteq B(y, \alpha\Delta)$, thus $\mathrm{diam}(C_y) \le \Delta$, so requirement (1) in Definition 1.1 is satisfied for every partition $P$ arising from this process. It remains to prove requirement (2).

Fix a point $x \in X$ and some value $t \le \Delta/8$. Let $a = |B(x, \Delta/8)|$, $b = |B(x,\Delta)|$, and arrange the points $w_1, \ldots, w_b \in B(x,\Delta)$ in increasing distance from $x$. Let $I_k = [d(x,w_k) - t, d(x,w_k) + t]$ and write $\mathcal{E}_k$ for the event that $\alpha\Delta \le d(x,w_k) + t$ and $w_k$ is the minimal element according to $\pi$ such that $\alpha\Delta \ge d(x,w_k) - t$. Note that if $w_k \in B(x,\Delta/8)$, then $\Pr[\mathcal{E}_k] = 0$ since in this case $d(x,w_k) + t < \Delta/8 + t \le \Delta/4 \le \alpha\Delta$. We claim that the event $\{B(x,t) \not\subseteq P(x)\}$ is contained in the event $\bigcup_{k=1}^{b} \mathcal{E}_k$. Indeed, let $w_j$ be the minimal element according to $\pi$ such that $C_{w_j} \cap B(x,t) \ne \emptyset$. It follows that $d(w_j,x) \le \alpha\Delta + t$. Furthermore, $d(x,w_j) \ge \alpha\Delta - t$, since otherwise $B(w_j, \alpha\Delta) \supseteq B(x,t)$, implying that $P(x) = C_{w_j} \supseteq B(x,t)$. Hence there exists $k$ for which the event $\mathcal{E}_k$ occurs.

Now

\[ \Pr[B(x,t) \not\subseteq P(x)] \le \sum_{k=a+1}^{b} \Pr[\mathcal{E}_k] = \sum_{k=a+1}^{b} \Pr[\alpha\Delta \in I_k] \cdot \Pr[\mathcal{E}_k \mid \alpha\Delta \in I_k] \le \sum_{k=a+1}^{b} \frac{2t}{\Delta/4} \cdot \frac{1}{k} \le \frac{8t}{\Delta} \log\frac{b}{a}, \]

where we used the fact that $\Pr[\mathcal{E}_k \mid \alpha\Delta \in I_k] \le \Pr[\forall j < k,\ \pi(w_j) > \pi(w_k)] = 1/k$. Setting $t = \varepsilon(x,u)\Delta \le \Delta/8$, where $\varepsilon(x,u)$ is as in (2), the right-hand side is at most $\frac{1}{2}$, proving requirement (2) in Definition 1.1. □
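The random partition used in this proof is simple to sample: draw a random ordering of the points and a random radius factor, then let each point join the first centre in the ordering whose open ball contains it. The following is a minimal sketch of that sampling step; the function name, the `rng` parameter, and the choice to return clusters as a list of sets are assumptions of this illustration.

```python
import random


def random_partition(points, d, delta, rng=random):
    """Sample one partition with cluster diameter at most delta.

    Sketch of the construction in the proof of Lemma 1.7: a uniformly random
    ordering pi of the points and a uniformly random alpha in [1/4, 1/2].
    """
    order = list(points)
    rng.shuffle(order)               # random permutation pi
    alpha = rng.uniform(0.25, 0.5)   # random radius factor
    clusters = {}
    for x in points:
        # x joins the first centre (in pi-order) whose open ball contains it;
        # x itself always qualifies, so a centre is always found.
        centre = next(y for y in order if d(x, y) < alpha * delta)
        clusters.setdefault(centre, set()).add(x)
    return list(clusters.values())
```

Every cluster lies in a ball of radius `alpha * delta` with `alpha <= 1/2`, which is what gives the diameter bound; the padding guarantee is a statement about the distribution and is not checked by a single sample.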



2.1 Proof of main embedding lemma 

We first introduce some notation. Define two intervals of integers $I, T \subseteq \mathbb{Z}$ by

\[ I = \{-6,-5,-4,-3,-2,-1,0,1,2,3\} \quad\text{and}\quad T = \{0,1,\ldots,\lceil \log_2 \Phi(\mu) \rceil\}. \]

For $t > 0$ write $\kappa(x,t) = \max\{\kappa \in \mathbb{Z} : \mu(B(x,2^\kappa)) < 2^t\}$. For each $u \in \mathbb{Z}$ let $P_u$ be chosen according to the distribution $\beta(u)$. Additionally, for $u \in \mathbb{Z}$ let $\{\sigma_u(C) : C \subseteq X\}$ be i.i.d. symmetric $\{0,1\}$-valued Bernoulli random variables. We assume throughout the ensuing argument that the random variables $\{\sigma_u(C) : C \subseteq X, u \in \mathbb{Z}\}$, $\{P_u : u \in \mathbb{Z}\}$ are mutually independent. For every $t \in T$ and $i \in I$ define a random subset $W_i^t \subseteq X$ by

\[ W_i^t = \{x \in X : \sigma_{\kappa(x,t)-i}(P_{\kappa(x,t)-i}(x)) = 0\}. \]

Our random embedding $f : X \to c_{00}$ is defined by $f(x) = (d(x, W_i^t) : i \in I, t \in T)$. In the sequel, we assume that $p < \infty$; the case $p = \infty$ follows similarly. Since each of the coordinates of $f$ is Lipschitz with constant 1, we have for all $x,y \in X$,

\[ \|f(x) - f(y)\|_p^p \le |I| \cdot |T| \cdot d(x,y)^p \le 50\,[\log \Phi(\mu)]\, d(x,y)^p. \tag{4} \]

The proof will be complete once we show that for all $x,y \in X$

\[ \mathbb{E}\|f(x) - f(y)\|_p^p \ge [\Omega(\min\{\delta_\varepsilon(x,y), \delta_\varepsilon(y,x)\}) \cdot d(x,y)]^p \cdot V_\mu(x,y). \tag{5} \]

Indeed, denote by $(\Omega, \Pr)$ the probability space on which the above random variables are defined, and consider the space $L_p(\Omega, c_{00})$, i.e. the space of all $c_{00}$-valued random variables $\zeta$ on $\Omega$ equipped with the $L_p$ norm $\|\zeta\|_p = (\mathbb{E}\|\zeta\|_p^p)^{1/p}$. Equations (4) and (5) show that the mapping $x \mapsto f(x)$ is the required embedding of $X$ into $L_p(\Omega, c_{00})$. Observe that all the distributions are actually finitely supported, since $X$ is finite, so that this can still be viewed as an embedding into $c_{00}$. See Remark 2.2 below for more details.



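To prove (5) fix $x,y \in X$, $x \ne y$. Without loss of generality we may assume that the maximum of

\[ \frac{\mu(B(x,2d(x,y)))}{\mu(B(x,d(x,y)/512))}, \qquad \frac{\mu(B(y,2d(x,y)))}{\mu(B(y,d(x,y)/512))} \]

in the definition of $V_\mu(x,y)$ is attained by the first term, namely, $\frac{\mu(B(x,2d(x,y)))}{\mu(B(x,d(x,y)/512))} \ge \frac{\mu(B(y,2d(x,y)))}{\mu(B(y,d(x,y)/512))}$. Using Lemma 1.9 it immediately follows that

\[ \frac{\mu(B(x,2d(x,y)))}{\mu(B(x,d(x,y)/512))} \ge 2. \tag{6} \]

Setting $R = \frac{1}{4} d(x,y)$, denote for $i \in \mathbb{Z}$, $s_i = \log_2 \mu(B(x, 2^i R))$. We next extend some immediate bounds on $\kappa(x,t)$ (in terms of $R$) to any nearby point $z \in B(x, R/256)$.

Claim 2.1. For $i \in I$ and all $t \in \mathbb{Z} \cap [s_{i-1}, s_i]$, every $z \in B(x, R/256)$ satisfies $\frac{R}{8} \le 2^{\kappa(z,t)-i} \le \frac{5R}{4}$.

Proof. By definition, $\mu(B(z, 2^{\kappa(z,t)})) < 2^t \le \mu(B(z, 2^{\kappa(z,t)+1}))$. For the upper bound, we have

\[ \mu\left(B(x, 2^{\kappa(z,t)} - R/256)\right) \le \mu\left(B(z, 2^{\kappa(z,t)})\right) < 2^t \le 2^{s_i} = \mu\left(B(x, 2^i R)\right), \]

implying that $2^{\kappa(z,t)} - \frac{R}{256} < 2^i R$, which yields $2^{\kappa(z,t)-i} < \frac{5R}{4}$ (recall that $i \ge -6$). For the lower bound, we have

\[ \mu\left(B(x, 2^{\kappa(z,t)+1} + R/256)\right) \ge \mu\left(B(z, 2^{\kappa(z,t)+1})\right) \ge 2^t \ge 2^{s_{i-1}} = \mu\left(B(x, 2^{i-1} R)\right). \]

We conclude that $2^{\kappa(z,t)+1} + \frac{R}{256} \ge 2^{i-1} R$, which implies that $\frac{R}{8} \le 2^{\kappa(z,t)-i}$. □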
Consider the following events

1. $\mathcal{E}_1 = \{d(x, X \setminus P_u(x)) \ge \delta_\varepsilon(x,y) 2^u$ for all $u \in \mathbb{Z}$ with $2^u \in [R/8, 5R/4]\}$,

2. $\mathcal{E}_2 = \{\sigma_u(P_u(x)) = 1$ for all $u \in \mathbb{Z}$ with $2^u \in [R/8, 5R/4]\}$,

3. $\mathcal{E}_3 = \{\sigma_u(P_u(x)) = 0$ for all $u \in \mathbb{Z}$ with $2^u \in [R/8, 5R/4]\}$,

4. $\mathcal{E}^{\mathrm{big}}_{i,t} = \{d(y, W_i^t) \ge \frac{1}{512}\delta_\varepsilon(x,y) R\}$,

5. $\mathcal{E}^{\mathrm{small}}_{i,t} = \{d(y, W_i^t) < \frac{1}{512}\delta_\varepsilon(x,y) R\}$.

The basic properties of these events are described in the following claim.

Claim 2.2. The following assertions hold true:

(a). $\Pr[\mathcal{E}_1], \Pr[\mathcal{E}_2], \Pr[\mathcal{E}_3] \ge 2^{-4}$.

(b). For all $i \in I$ and $t \in \mathbb{Z} \cap [s_{i-1}, s_i]$, the event $\mathcal{E}_3$ is independent of $\mathcal{E}^{\mathrm{big}}_{i,t}$.

(c). For all $i \in I$ and $t \in \mathbb{Z} \cap [s_{i-1}, s_i]$, the event $\mathcal{E}_2$ is independent of $\mathcal{E}_1 \cap \mathcal{E}^{\mathrm{small}}_{i,t}$.

(d). For all $i \in I$ and $t \in \mathbb{Z} \cap [s_{i-1}, s_i]$, if the event $\mathcal{E}_3$ occurs then $x \in W_i^t$.

(e). For all $i \in I$ and $t \in \mathbb{Z} \cap [s_{i-1}, s_i]$, if the event $\mathcal{E}_1 \cap \mathcal{E}_2$ occurs then $d(x, W_i^t) \ge \frac{1}{256}\delta_\varepsilon(x,y) R$.

Proof. For the first assertion, fix $u$ such that $2^u \in [R/8, 5R/4]$. Since $\delta_\varepsilon(x,y) \le \varepsilon(x,u)$, and $P_u$ is chosen from $\beta(u)$ which is $\varepsilon(\cdot,u)$-padded, $\Pr[d(x, X \setminus P_u(x)) \ge \delta_\varepsilon(x,y) 2^u] \ge \frac{1}{2}$. In addition, $\Pr[\sigma_u(P_u(x)) = 1] = \Pr[\sigma_u(P_u(x)) = 0] = 1/2$. Furthermore, the number of relevant values of $u$ in each of the events $\mathcal{E}_1, \mathcal{E}_2, \mathcal{E}_3$ is at most four for each event, and the outcomes for different values of $u$ are mutually independent. This implies assertion (a).

To prove the second and third assertions note that for $2^u \in [R/8, 5R/4]$, we always have $\mathrm{diam}(P_u(x)) \le \frac{5R}{4} < d(x,y)$, and thus $P_u(x) \ne P_u(y)$. Furthermore, for every $z \in B(y, \frac{1}{512}\delta_\varepsilon(x,y) R)$, we always have $d(x,z) \ge 3R > \mathrm{diam}(P_u(x)) + \mathrm{diam}(P_u(z))$, thus $P_u(x) \ne P_u(z)$, and the choices of $\sigma_u(P_u(x))$ and $\sigma_u(P_u(z))$ are independent. It follows that the value of $\sigma_u(P_u(x))$ is independent of the data determining whether $d(y, W_i^t) < \frac{1}{512}\delta_\varepsilon(x,y) R$, which proves (b). Assertion (c) follows similarly, observing that $\sigma_u(P_u(x))$ is independent also of the value of $d(x, X \setminus P_u(x))$.

To prove the last two assertions fix $i \in I$ and $t \in \mathbb{Z} \cap [s_{i-1}, s_i]$. An application of Claim 2.1 to $z = x$ shows that $2^{\kappa(x,t)-i} \in [R/8, 5R/4]$. Now, by the construction of $W_i^t$, if $\mathcal{E}_3$ occurs then $x \in W_i^t$; this proves assertion (d). Finally, fix any $z \in B(x, \frac{1}{256}\delta_\varepsilon(x,y) R)$. Since $z \in B(x, R/256)$, Claim 2.1 implies that $2^{\kappa(z,t)-i} \in [R/8, 5R/4]$. The event $\mathcal{E}_1$ implies that for $u = \kappa(z,t) - i$, $P_u(x) = P_u(z)$, and thus $\sigma_{\kappa(z,t)-i}(P_{\kappa(z,t)-i}(z)) = \sigma_{\kappa(z,t)-i}(P_{\kappa(z,t)-i}(x))$. Now $\mathcal{E}_2$ implies that the latter quantity is 1, and hence $z \notin W_i^t$. Assertion (e) follows. □



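We can now conclude the proof of Lemma 1.8. Fix $i \in I$ and $t \in \mathbb{Z} \cap [s_{i-1}, s_i]$. By assertions (d) and (e), if either of the (disjoint) events $\mathcal{E}_3 \cap \mathcal{E}^{\mathrm{big}}_{i,t}$ and $\mathcal{E}_1 \cap \mathcal{E}_2 \cap \mathcal{E}^{\mathrm{small}}_{i,t}$ occurs then $|d(x, W_i^t) - d(y, W_i^t)| \ge \frac{1}{512}\delta_\varepsilon(x,y) R$. The probability of this is $\Pr[\mathcal{E}_3] \cdot \Pr[\mathcal{E}^{\mathrm{big}}_{i,t}] + \Pr[\mathcal{E}_2] \cdot \Pr[\mathcal{E}_1 \cap \mathcal{E}^{\mathrm{small}}_{i,t}] \ge 2^{-4} \Pr[\mathcal{E}_1] = \Omega(1)$, where we have used assertions (a), (b) and (c) and the fact that $\mathcal{E}^{\mathrm{big}}_{i,t} \cup (\mathcal{E}_1 \cap \mathcal{E}^{\mathrm{small}}_{i,t}) \supseteq \mathcal{E}_1$. It follows that $\mathbb{E}|d(x, W_i^t) - d(y, W_i^t)|^p \ge [\Omega(\delta_\varepsilon(x,y) \cdot d(x,y))]^p$, and hence

\begin{align*}
\mathbb{E}\|f(x) - f(y)\|_p^p &\ge \sum_{i \in I} \sum_{t \in \mathbb{Z} \cap [s_{i-1}, s_i]} \mathbb{E}|d(x, W_i^t) - d(y, W_i^t)|^p \\
&\ge [\Omega(\delta_\varepsilon(x,y) \cdot d(x,y))]^p \sum_{i=-6}^{3} |\mathbb{Z} \cap [s_{i-1}, s_i]| \\
&\ge [\Omega(\delta_\varepsilon(x,y) \cdot d(x,y))]^p \cdot \frac{s_3 - s_{-7}}{2} \tag{7} \\
&\ge [\Omega(\delta_\varepsilon(x,y) \cdot d(x,y))]^p \cdot V_\mu(x,y),
\end{align*}

where in (7) we used the fact that (6) implies that $s_3 - s_{-7} \ge 1$.

This completes the proof of Lemma 1.8. □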

Remark 2.1. The above proof actually yields an embedding (p, such that for all x,y £ X satisfying 

m, 



■ p{B{x,2d{x,y))) 
log- 



fi{B{x,d{x,y)/512)) 

Remark 2.2. If in the above proof we use sampling and a standard Chernoff bound instead of taking expectations, we can ensure that the embedding takes values in $\mathbb{R}^k$, where $k = O[(\log n) \log \Phi(\mu)]$. (This is because the lower bound on $\mathbb{E}|d(x, W_i^t) - d(y, W_i^t)|^p$ relies on an event that happens with constant probability, similar to [27].) In particular, when $\mu$ is the counting measure on $X$ we get that $k = O[(\log n)^2]$. It would be interesting to improve this bound to $k = O(\log n)$ (if $p = 2$ then this follows from the Johnson-Lindenstrauss dimension reduction lemma [18]).

2.2 Optimal volume-respecting embeddings

Here we prove Theorem 1.5. Let $f : X \to \ell_2$ be the embedding constructed in the previous section and $g = f / \sqrt{50 \log \Phi(\mu)}$, so that $g$ is 1-Lipschitz. For concreteness, we denote by $(\Omega, \Pr)$ the probability space over which the random embedding $g$ is defined.

Lemma 2.3. Fix a subset $Y \subseteq X$, $x \in X \setminus Y$ and let $y_0 \in Y$ satisfy $d(x,Y) = d(x,y_0)$. Let $Z$ be any $\ell_2$-valued random variable on $(\Omega, \Pr)$ which is measurable with respect to the $\sigma$-algebra generated by the random variables $\{g(y)\}_{y \in Y}$. Then

\[ \frac{\sqrt{\mathbb{E}\|Z - g(x)\|_2^2}}{d(x,Y)} \ge C \cdot \delta_\varepsilon(x,y_0) \sqrt{\frac{1}{\log \Phi(\mu)} \log\left(\frac{\mu(B(x,2d(x,y_0)))}{\mu(B(x,d(x,y_0)/512))}\right)}, \]

where $C$ is a universal constant.

Proof. We use the notation of Section 2.1. Denote $R = \frac{1}{4} d(x,y_0)$ and write $Z = (Z_i^t : i \in I, t \in T)$. Consider the events $\mathcal{E}^{\mathrm{big}}_{i,t} = \{Z_i^t \ge \frac{1}{512}\delta_\varepsilon(x,y_0) R\}$ and $\mathcal{E}^{\mathrm{small}}_{i,t} = \{Z_i^t < \frac{1}{512}\delta_\varepsilon(x,y_0) R\}$. Arguing as in Section 2.1 it is enough to check that $\mathcal{E}_3$ is independent of $\mathcal{E}^{\mathrm{big}}_{i,t}$ and that $\mathcal{E}_2$ is independent of $\mathcal{E}_1 \cap \mathcal{E}^{\mathrm{small}}_{i,t}$. Observe that the proof of assertions (b) and (c) in Claim 2.2 uses only the fact that $d(x,y) \ge 4R$ (when considering $z \in B(y, \frac{1}{512}\delta_\varepsilon(x,y) R)$), and this now holds for all $y \in Y$. Since we assume that $Z_i^t$ is measurable with respect to $\{d(y, W_i^t)\}_{y \in Y}$, the required independence follows. □

We remark that we will apply Lemma 2.3 to random variables of the form $Z = \sum_{y \in Y} c_y g(y)$, where the $c_y$'s are scalars. However, the same statement holds for $Z$'s which are arbitrary functions of the variables $\{g(y)\}_{y \in Y}$.

The following lemma is based on a Bourgain-style embedding.

Lemma 2.4. Define a subset $S \subseteq X$ by $S = \{x \in X : |B(x,R)| \le e|B(x,R/2)|\}$. Then there exists a 1-Lipschitz map $F : X \to L_2$ such that if $x \in S$ and $Y \subseteq X$ with $d(x,Y) \ge R$, then

\[ d_{L_2}\left(F(x), \mathrm{span}\{F(y)\}_{y \in Y}\right) \ge \frac{C R}{\sqrt{\log n}}, \]

where $C > 0$ is a universal constant.

Proof. For each $t \in \{1,2,\ldots,\lceil \log n \rceil\}$, let $W_t \subseteq X$ be a random subset which contains each point of $X$ independently with probability $e^{-t}$. Let $g_t(x) = \min\{d(x, W_t), R/4\}$ and define the random map $f = \frac{1}{\sqrt{\lceil \log n \rceil}}(g_1 \oplus \cdots \oplus g_{\lceil \log n \rceil})$ so that $\|f\|_{\mathrm{Lip}} \le 1$. Finally, we define $F : X \to L_2(\nu)$ by $F(x) = f(x)$, where $\nu$ is the distribution over which the random subsets $\{W_t\}$ are defined.

Now fix $x \in S$ and let $t \in \mathbb{N}$ be such that $e^t \le |B(x,R/2)| \le e^{t+1}$. Let $\mathcal{E}^{\mathrm{far}}$ be the event $\{d(x, W_t) \ge R/4\}$ and let $\mathcal{E}^{\mathrm{close}}$ be the event $\{d(x, W_t) \le R/8\}$. Clearly both such events are independent of the values $\{g_t(y) : d(x,y) \ge R\} \supseteq \{g_t(y)\}_{y \in Y}$ (this relies crucially on the use of $\min\{\cdot, R/4\}$ in the definition of $g_t$). Fixing $c_y \in \mathbb{R}$ for each $y \in Y$, we see that

\[ \left\| F(x) - \sum_{y \in Y} c_y F(y) \right\|_{L_2(\nu)}^2 = \mathbb{E}_\nu \left\| f(x) - \sum_{y \in Y} c_y f(y) \right\|_2^2 \ge \frac{1}{\lceil \log n \rceil}\, \mathbb{E}_\nu \left( g_t(x) - \sum_{y \in Y} c_y g_t(y) \right)^2 \ge \frac{R^2}{2^{10} \lceil \log n \rceil} \cdot \min\{\nu(\mathcal{E}^{\mathrm{far}}), \nu(\mathcal{E}^{\mathrm{close}})\}. \]

Finally, we observe that by the definition of $S$, $\nu(\mathcal{E}^{\mathrm{far}})$ and $\nu(\mathcal{E}^{\mathrm{close}})$ can clearly be lower bounded by some universal constant. □



Proof of Theorem 1.5. Using the notation of Lemma 2.3, consider the Hilbert space $H = L_2(\Omega, \ell_2)$, i.e. the space of all square integrable $\ell_2$-valued random variables $\zeta$ on $\Omega$ equipped with the Hilbertian norm $\|\zeta\|_2 = \sqrt{\mathbb{E}\|\zeta\|_2^2}$. Defining $G : X \to H$ via $G(x) = g(x)$, Lemma 2.3 implies that for every $Y \subseteq X$ and $x \in X \setminus Y$,

\[ \frac{d_H(G(x), \mathrm{span}(\{G(y)\}_{y \in Y}))}{d(x,Y)} \ge C \cdot \delta_\varepsilon(x,y_0) \sqrt{\frac{1}{\log \Phi(\mu)} \log\left(\frac{\mu(B(x,2d(x,y_0)))}{\mu(B(x,d(x,y_0)/512))}\right)}. \tag{8} \]

We now argue as in the proof of Theorem 1.4. Let $H_1, H_2$ be Hilbert spaces and $G_1 : X \to H_1$, $G_2 : X \to H_2$ be two 1-Lipschitz mappings satisfying for every $Y \subseteq X$ and $x \in X \setminus Y$,

\[ \frac{d_{H_1}(G_1(x), \mathrm{span}(\{G_1(y)\}_{y \in Y}))}{d(x,Y)} \ge \frac{C}{16 + 16 \log\left(\frac{|B(x,2d(x,y_0))|}{|B(x,d(x,y_0)/512)|}\right)} \sqrt{\frac{1}{\log n} \log\left(\frac{|B(x,2d(x,y_0))|}{|B(x,d(x,y_0)/512)|}\right)} \]

and

\[ \frac{d_{H_2}(G_2(x), \mathrm{span}(\{G_2(y)\}_{y \in Y}))}{d(x,Y)} \ge \frac{C}{\alpha_X} \sqrt{\frac{1}{\log n} \log\left(\frac{|B(x,2d(x,y_0))|}{|B(x,d(x,y_0)/512)|}\right)}, \]

where $d(x,y_0) = d(x,Y)$, and we used (8) with $\mu$ being the counting measure on $X$. Also, by Lemma 2.4 there is a Hilbert space $H_3$ and a mapping $G_3 : X \to H_3$ such that for every such $Y, x, y_0$,

\[ \frac{d_{H_3}(G_3(x), \mathrm{span}(\{G_3(y)\}_{y \in Y}))}{d(x,Y)} \ge \frac{C}{\sqrt{\log n}} \cdot \mathbf{1}_{\{|B(x,d(x,y_0)/2)| \le e|B(x,d(x,y_0)/8)|\}}. \]

Denoting $H = H_1 \oplus H_2 \oplus H_3$ and $G = \frac{1}{\sqrt{3}}(G_1 \oplus G_2 \oplus G_3)$, a similar argument as in the proof of Theorem 1.4 implies that

\[ \frac{d_H(G(x), \mathrm{span}(\{G(y)\}_{y \in Y}))}{d(x,Y)} \ge \Omega\left(\frac{1}{\sqrt{\alpha_X \log n}}\right). \]

Now, Feige's argument (see the proof of Lemma 15 in [12]) yields the required result. □

Remark 2.3. Note that, by general dimension reduction techniques which preserve distance to affine hulls [30], the dimension of the above embedding can be reduced to $O(k \log n)$ while maintaining the volume-respecting property for $k$-point subsets.

3 Low-dimensional embeddings of planar metrics

In this section we refine the ideas of the previous section and prove Theorem 1.6. We say that a metric $(X,d)$ is planar (resp. excludes $K_{s,s}$ as a minor) if there exists a graph $G = (X,E)$ with positive edge weights, such that $G$ is planar (resp. does not admit the complete bipartite graph $K_{s,s}$ as a minor) and $d(\cdot,\cdot)$ is the shortest path metric on a subset of $G$. We shall obtain optimal low-dimensional embeddings of planar metrics into $\ell_\infty$ by proving the following more general result.

Theorem 3.1. Let $(X,d)$ be an $n$-point metric space that excludes $K_{s,s}$ as a minor. Then $X$ embeds into $\ell_\infty^{O(3^s (\log s) \log n)}$ with distortion $O(s^2)$.

We will need three lemmas. The first one exhibits a family of decompositions with respect to a diameter bound $\Delta > 0$; it follows easily from [19], with improved constants due to [?]. Note that in contrast to Definition 1.1 (and also to Rao's embedding [33]), we require that $x$ and $y$ are padded simultaneously.

Lemma 3.2. There exists a constant $c$ such that for every metric space $(X,d)$ that excludes $K_{s,s}$ as a minor, and for every $\Delta > 0$, there exists a set of $k = 3^s$ partitions $P_1, \ldots, P_k$ of $X$, such that

1. For every $C \in P_i$, $\mathrm{diam}(C) \le \Delta$.

2. For every pair $x,y \in X$, there exists an $i$ such that for $T = cs^2$,

\[ B(x, \Delta/T) \subseteq P_i(x) \quad\text{and}\quad B(y, \Delta/T) \subseteq P_i(y). \]

Proof. Fix an edge-weighted graph $G$ that does not admit $K_{s,s}$ as a minor and whose shortest path distance is $d(\cdot,\cdot)$. Fix also some $x_0 \in X$ and $\delta > 0$. For $i \in \{0,1,2\}$ and $j \in \{0\} \cup \mathbb{N}$ define:

\[ A_j^i = \left\{ x \in X : 9(j-1) + 3i \le \frac{d(x,x_0)}{\delta} < 9j + 3i \right\}. \]

For every $i$, $P^i = \{A_j^i\}_{j \ge 0}$ clearly forms a partition of $X$. Let us say that a subset $S' \subseteq X$ cuts a subset $S \subseteq X$ if $S' \cap S \ne \emptyset$ and $S \not\subseteq S'$. Observe that for every $x \in X$ at most one of the partitions $P^0, P^1, P^2$ contains a set that cuts $B(x,\delta)$, as otherwise there exist $z_1, z_2 \in B(x,\delta)$ for which $d(z_1,z_2) \ge d(x_0,z_1) - d(x_0,z_2) \ge 3\delta$. Thus, for every $x,y \in X$, for one of the partitions $P^0, P^1, P^2$ each of $B(x,\delta)$ and $B(y,\delta)$ is contained in one of its clusters. For each cluster $C$ of the partitions $P^0, P^1, P^2$, consider the subgraph of $G$ induced on the points of $C$, partition it into its connected components, and apply the above process again to each such connected component. Continuing this way a total of $s$ times, we end up with $3^s$ partitions, and in at least one of them, neither $B(x,\delta)$ nor $B(y,\delta)$ is cut. The results of [?] show there exists a constant $c > 0$ such that the diameter of each cluster in the resulting partitions is at most $cs^2\delta$, and the lemma follows by setting $\delta = \Delta/(cs^2)$. □
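One round of the slicing step in this proof can be sketched directly: the three partitions are formed by annuli of width $9\delta$ around $x_0$, offset from each other by $3\delta$, so a ball of radius $\delta$ can straddle a boundary in at most one of them. The code below builds the three partitions for a finite point set and tests whether a ball is left uncut; the function names and the dictionary representation are assumptions of this illustration, and the recursion over connected components is not included.

```python
import math


def annulus_partitions(points, d, x0, delta):
    """Return the partitions P^0, P^1, P^2 as dicts mapping j to the set A_j^i."""
    partitions = []
    for i in range(3):
        clusters = {}
        for x in points:
            r = d(x, x0) / delta
            # j is the unique index with 9(j-1) + 3i <= r < 9j + 3i
            j = math.floor((r - 3 * i) / 9) + 1
            clusters.setdefault(j, set()).add(x)
        partitions.append(clusters)
    return partitions


def ball_is_uncut(x, points, d, delta, clusters):
    """True if the open ball B(x, delta) lies inside a single cluster."""
    ball = {y for y in points if d(x, y) < delta}
    return any(ball <= c for c in clusters.values())
```

Since each of $x$ and $y$ rules out at most one of the three partitions, at least one partition leaves both balls uncut, which is the property carried through the $s$ rounds of the recursion.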

We next consider a collection of such decompositions, with diameter bounds $\Delta > 0$ that are
proportional to the integral powers of $4T$. Furthermore, we need these decompositions to be nested.

Lemma 3.3. Let $(X,d)$ be a metric space that excludes $K_{s,s}$ as a minor, and let $T = O(s^2)$ be
as in Lemma 3.2. Then for every $a > 0$ there exist $k = 3^s$ families of partitions of $X$, $\{P^i_u\}_{u \in \mathbb{Z}}$,
$i = 1, \ldots, k$, with the following properties:

1. For each $i$ the partitions $\{P^i_u\}_{u \in \mathbb{Z}}$ are nested, i.e. $P^i_{u-1}$ is a refinement of $P^i_u$ for all $u$.

2. For each $i$, every $C \in P^i_u$ satisfies $\mathrm{diam}(C) \le a(4T)^u$.

3. For each $u \in \mathbb{Z}$ and every pair $x,y \in X$, there exists an $i$ such that

\[ B\big(x, a(4T)^u/(2T)\big) \subseteq P^i_u(x) \quad\text{and}\quad B\big(y, a(4T)^u/(2T)\big) \subseteq P^i_u(y). \]

Proof. Let $P_1, \ldots, P_k$ be partitions as in Lemma 3.2 with $\Delta = a(4T)^u$ and let $Q_1, \ldots, Q_k$ be
partitions as in Lemma 3.2 with $\Delta = a(4T)^{u-1}$. Fix $j$ and $C \in P_j$, let
$S_C = \{A \in Q_j : A \cap C \ne \emptyset, \text{ but } A \not\subseteq C\}$, and replace every $C \in P_j$ by the sets $A \in S_C$ and the set
$C' = C \setminus \bigcup_{A \in S_C} A$. Continuing this process we replace the partition $P_j$ by a new partition $P_j'$ such that $Q_j$
is a refinement of $P_j'$. Note that we do not alter $Q_j$. Since $\mathrm{diam}(A) \le a(4T)^{u-1}$, we have that if
$C \in P_j$ and $B(x, a(4T)^u/T) \subseteq C$, then $B(x, 2a(4T)^{u-1}) \subseteq C'$. Continuing this process inductively
we obtain the required families of nested partitions. □
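
The single refinement step used in this proof can be sketched on finite sets. The block below is an illustrative sketch only: clusters are modelled as Python sets, and the function name `nest` is hypothetical. It replaces each coarse cluster $C$ by $C'$ (with the straddling fine clusters removed) and keeps the straddling fine clusters as clusters of their own.

```python
def nest(coarse, fine):
    """Modify `coarse` so that `fine` refines it, leaving `fine` unchanged.

    A fine cluster "straddles" if it is not contained in any single coarse
    cluster. Straddling clusters become clusters of the new partition, and
    their points are removed from the coarse clusters they intersect.
    """
    straddling = [A for A in fine if not any(A <= C for C in coarse)]
    removed = set().union(*straddling)
    trimmed = [C - removed for C in coarse]
    return [C for C in trimmed if C] + straddling

# Toy example on the ground set {0, ..., 7}.
coarse = [{0, 1, 2, 3}, {4, 5, 6, 7}]
fine = [{0, 1}, {2}, {3, 4}, {5, 6}, {7}]
nested = nest(coarse, fine)

# The result is again a partition, and every fine cluster sits inside one of its clusters.
assert sorted(x for C in nested for x in C) == list(range(8))
assert all(any(A <= C for C in nested) for A in fine)
```

Only the fine cluster `{3, 4}` straddles here, so the two coarse clusters shrink to `{0, 1, 2}` and `{5, 6, 7}`. In the lemma, the diameter bound on the fine clusters is what guarantees that a well-padded ball survives this trimming.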

We next use a nested sequence of partitions $\{P_u\}_{u \in \mathbb{Z}}$ to form a mapping $\psi : X \to \ell_\infty^{O(\log |X|)}$.

Lemma 3.4. Let $\{P_u\}_{u \in \mathbb{Z}}$ be a sequence of partitions of $X$ that is nested (i.e. $P_{u-1}$ is a refinement
of $P_u$), and let $m \ge 0$ and $D \ge 2$ be such that for all $C \in P_u$, $\mathrm{diam}(C) < 2^m D^u$. Assume further
that $P_{u_1} = \{X\}$, $P_{u_2} = \{\{x\} : x \in X\}$. Then for all $u_2 \le u \le u_1$ and all $A \in P_u$ there exists a
mapping $\psi : A \to \mathbb{R}^{2\lceil \log_2 |A| \rceil}$ that satisfies:

(a). For every $x \in A$ and every $1 \le j \le 2\lceil \log_2 |A| \rceil$ there exists $u' < u$ for which

\[ |\psi(x)_j| = \min\{ d(x, X \setminus P_{u'}(x)),\ 2^m D^{u'} \}. \]

(b). For all $x,y \in A$, $\|\psi(x) - \psi(y)\|_\infty \le 2d(x,y)$.

(c). If $x,y \in A$ are such that for some $u' \le u$, $d(x,y) \in [2^m D^{u'-1}, 2^{m+1} D^{u'-1})$ and there exists
a cluster $C \in P_{u'}$ for which $x,y \in C$, $B(x, 2^{m+1} D^{u'-2}) \subseteq P_{u'-1}(x)$ and $B(y, 2^{m+1} D^{u'-2}) \subseteq P_{u'-1}(y)$,
then $\|\psi(x) - \psi(y)\|_\infty \ge \frac{d(x,y)}{2D}$.

Proof. Proceed by induction on $u$. The statement is vacuous for $u = u_2$, so we assume it holds for $u$
and construct the required mapping for $u+1$. Fix $A \in P_{u+1}$ and assume that $H = \{A_1, \ldots, A_r\} \subseteq P_u$
is a partition of $A$. By induction there are mappings $\psi_i : A_i \to \mathbb{R}^{2\lceil \log_2 |A_i| \rceil}$ satisfying (a)-(c)
above (with respect to $A_i$ and $u$).

For $h \in \{0\} \cup \mathbb{N}$ denote $C_h = \{A_i \in H : 2^{h-1} < |A_i| \le 2^h\}$. We claim that for every $i = 1, \ldots, r$ there
is a choice of a string of signs $\sigma^i \in \{-1,1\}^{2\lceil \log_2 |A| \rceil - 2\lceil \log_2 |A_i| \rceil}$ such that for all $h$ and for all distinct
$A_i, A_j \in C_h$, $\sigma^i \ne \sigma^j$. Indeed, fix $h$; if $h \ge \log_2 |A|$ then for $A_i \in C_h$, $|A_i| > 2^{h-1} \ge |A|/2$; thus
$|C_h| = 1$ and there is nothing to prove. So, assume that $h < \log_2 |A|$ and note that $|C_h| < |A|/2^{h-1}$.
Hence, the required strings of signs exist provided $2^{2\lceil \log_2 |A| \rceil - 2h} \ge |A|/2^{h-1}$, which is true since
$h + 1 \le \lceil \log_2 |A| \rceil$.

Now, for every $i = 1, \ldots, r$ define a mapping $\zeta_i : A_i \to \mathbb{R}^{2\lceil \log_2 |A| \rceil - 2\lceil \log_2 |A_i| \rceil}$ by

\[ \zeta_i(x) = \min\{ d(x, X \setminus A_i),\ 2^m D^u \} \cdot \sigma^i. \]

Finally, define the mapping $\psi : A \to \mathbb{R}^{2\lceil \log_2 |A| \rceil}$ by $\psi|_{A_i} = \psi_i \oplus \zeta_i$. Requirement (a) holds for $\psi$
by construction. To prove requirement (b), i.e. that $\psi$ is 2-Lipschitz, fix $x,y \in A$. If for some $i$,
both $x,y \in A_i$ then by the inductive hypothesis $\psi_i$ is 2-Lipschitz, and clearly $\zeta_i$ is 1-Lipschitz, so
$\|\psi(x) - \psi(y)\|_\infty \le 2d(x,y)$. Otherwise, fix a coordinate $1 \le j \le 2\lceil \log_2 |A| \rceil$ and use (a) to take
$u' \le u$ such that $|\psi(x)_j| \le d(x, X \setminus P_{u'}(x))$; since $y \notin P_{u'}(x)$ this is at most $d(x,y)$. It similarly
follows that $|\psi(y)_j| \le d(x,y)$, and hence $|\psi(x)_j - \psi(y)_j| \le 2d(x,y)$.

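To prove that requirement (c) holds for $\psi$, take $x,y \in A$ and $u' \le u+1$ such that
$d(x,y) \in [2^m D^{u'-1}, 2^{m+1} D^{u'-1})$ and there exists a cluster $C \in P_{u'}$ for which $x,y \in C$,
$B(x, 2^{m+1} D^{u'-2}) \subseteq P_{u'-1}(x)$ and $B(y, 2^{m+1} D^{u'-2}) \subseteq P_{u'-1}(y)$. The case $u' \le u$ follows by induction,
so assume that $u' = u+1$. Let $i,j \in \{1, \ldots, r\}$ be such that $x \in A_i$, $y \in A_j$; then $i \ne j$, since
$\mathrm{diam}(A_i) < 2^m D^u \le d(x,y)$. Assume first that $\lceil \log_2 |A_i| \rceil \ne \lceil \log_2 |A_j| \rceil$, and without loss of generality
suppose $\lceil \log_2 |A_i| \rceil < \lceil \log_2 |A_j| \rceil$; then there is a coordinate $\ell = 2\lceil \log_2 |A_i| \rceil + 1$ for which

\[ |\psi(x)_\ell| = |\zeta_i(x)_1| = \min\{ d(x, X \setminus A_i),\ 2^m D^u \}, \]

and, for some $u'' < u$,

\[ |\psi(y)_\ell| = |\psi_j(y)_\ell| = \min\{ d(y, X \setminus P_{u''}(y)),\ 2^m D^{u''} \}. \]

It follows that $|\psi(x)_\ell| \ge 2^{m+1} D^{u-1}$ (since we assumed $B(x, 2^{m+1} D^{u-1}) \subseteq A_i$), and that
$|\psi(y)_\ell| \le 2^m D^{u-1}$, and therefore

\[ |\psi(x)_\ell - \psi(y)_\ell| \ge 2^m D^{u-1} > \frac{d(x,y)}{2D}. \]

It remains to deal with the case $\lceil \log_2 |A_i| \rceil = \lceil \log_2 |A_j| \rceil$. By our choice of sign sequences, in this
case there is an index $\ell$ for which $\sigma^i_\ell \ne \sigma^j_\ell$, and thus, for $\ell' = \ell + 2\lceil \log_2 |A_i| \rceil$,
$|\psi(x)_{\ell'} - \psi(y)_{\ell'}| = |\psi(x)_{\ell'}| + |\psi(y)_{\ell'}|$. Since we assumed $B(x, 2^{m+1} D^{u-1}) \subseteq A_i$ and
$B(y, 2^{m+1} D^{u-1}) \subseteq A_j$, we get

\[ |\psi(x)_{\ell'} - \psi(y)_{\ell'}| \ge 2 \cdot 2^{m+1} D^{u-1} > \frac{d(x,y)}{2D}. \]

□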
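
The counting claim about sign strings in the proof above can be made concrete. The following is a minimal sketch under stated assumptions: clusters are represented only by their sizes, the helper names `ceil_log2` and `sign_strings` are hypothetical, and the particular way of choosing strings (binary digits of a counter) is one convenient choice rather than the authors' construction. It assigns to each part $A_i$ a string of length $2\lceil \log_2 |A| \rceil - 2\lceil \log_2 |A_i| \rceil$ so that parts in the same class $C_h$ receive distinct strings.

```python
from collections import defaultdict

def ceil_log2(n):
    """Ceiling of log2(n) for an integer n >= 1."""
    return (n - 1).bit_length()

def sign_strings(part_sizes):
    """Assign a sign string to each part of a partition of A, given part sizes.

    Parts with the same value h = ceil(log2 |A_i|) get pairwise distinct
    strings of length 2*ceil(log2 |A|) - 2*h.
    """
    total = sum(part_sizes)
    next_in_class = defaultdict(int)
    strings = []
    for size in part_sizes:
        h = ceil_log2(size)
        length = 2 * ceil_log2(total) - 2 * h
        k = next_in_class[h]
        next_in_class[h] += 1
        if k >= 2 ** length:
            # The counting argument in the proof says this cannot happen.
            raise ValueError("not enough sign strings for class %d" % h)
        strings.append(tuple(1 if (k >> b) & 1 else -1 for b in range(length)))
    return strings

# Example: |A| = 10 split into parts of sizes 3, 3, 2, 1, 1.
s = sign_strings([3, 3, 2, 1, 1])
assert len(s[0]) == 4 and s[0] != s[1]   # both parts of size 3 lie in class h = 2
assert len(s[3]) == 8 and s[3] != s[4]   # both singletons lie in class h = 0
```

The `ValueError` branch documents the inequality $2^{2\lceil \log_2 |A| \rceil - 2h} \ge |A|/2^{h-1}$: if it held with the wrong sign for some class, the assignment would run out of strings.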

Finally, we prove the main result of this section by concatenating several of the above maps $\psi$.

Proof of Theorem 3.1. For each $m \in \{0, 1, \ldots, \lceil \log_2(4cs^2) \rceil\}$ set $a = 2^m$, and apply Lemma 3.3 to
obtain $3^s$ families of nested partitions $\{P^1_u\}_{u \in \mathbb{Z}}, \ldots, \{P^{3^s}_u\}_{u \in \mathbb{Z}}$ that satisfy the conclusion of
Lemma 3.3 with $T = cs^2$. For every $i = 1, \ldots, 3^s$, let $\psi^m_i$ be the mapping that Lemma 3.4
yields for $\{P^i_u\}_{u \in \mathbb{Z}}$ when setting $A = X$ and $D = 4cs^2$. Consider the map $\Psi = \bigoplus_{m,i} \psi^m_i$, which
takes values in $\ell_\infty^{O(3^s (\log s) \log n)}$. Clearly $\Psi$ is 2-Lipschitz. Moreover, for every $x,y \in X$ there is
$m \in \{0, 1, \ldots, \lceil \log_2(4cs^2) \rceil\}$ and $u \in \mathbb{Z}$ such that $d(x,y) \in [2^m D^u, 2^{m+1} D^u)$. By Lemma 3.3 there
is $i \in \{1, \ldots, 3^s\}$ for which $B(x, 2^{m+1} D^{u-1}) \subseteq P^i_u(x)$ and $B(y, 2^{m+1} D^{u-1}) \subseteq P^i_u(y)$; it then follows
using Lemma 3.4 that

\[ \|\Psi(x) - \Psi(y)\|_\infty \ge \|\psi^m_i(x) - \psi^m_i(y)\|_\infty = \Omega\big(d(x,y)/s^2\big), \]

as required. □
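
The choice of the pair $(m,u)$ in this proof is a two-level scale decomposition of the distance $d(x,y)$. Below is a small sketch of that selection, with assumptions stated plainly: the function name `scale_indices` is hypothetical, exact rational arithmetic is used to avoid rounding at interval endpoints, and the value $D = 16$ in the example is an arbitrary stand-in for $4cs^2$ (the constant $c$ is not specified in the text).

```python
from fractions import Fraction

def scale_indices(d, D):
    """Return (m, u) with 2^m * D^u <= d < 2^(m+1) * D^u.

    Assumes d > 0 and D >= 2. The returned m satisfies 2^m < D.
    """
    d, D = Fraction(d), Fraction(D)
    u = 0
    while D ** u > d:           # move down until D^u <= d
        u -= 1
    while D ** (u + 1) <= d:    # move up until d < D^(u+1)
        u += 1
    ratio = d / D ** u          # now 1 <= ratio < D
    m = 0
    while 2 ** (m + 1) <= ratio:
        m += 1
    return m, u

# Example with the stand-in value D = 16: 2^1 * 16^1 = 32 <= 40 < 64.
assert scale_indices(40, 16) == (1, 1)
```

Because the ratio $d / D^u$ lies in $[1, D)$, the index $m$ stays within $\{0, 1, \ldots, \lceil \log_2 D \rceil\}$, which is why only $O(\log s)$ values of $m$ are needed in the direct sum.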
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